CMB power spectrum estimation using wavelets 
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Observations of the Cosmic Microwave Background (CMB) provide increasingly accurate infor- 
mation about the structure of the Universe at the recombination epoch. Most of this information 
is encoded in the angular power spectrum of the CMB. The aim of this work is to propose a 
versatile and powerful method for spectral estimation on the sphere which can easily deal with 
non-stationarity, foregrounds and multiple experiments with various specifications. In this paper, 
we use needlets (wavelets) on the sphere to construct natural and efficient spectral estimators for 
partially observed and beamed CMB with non stationary noise. In the case of a single experiment, 
we compare this method with Pseudo-C^ methods. The performance of the needlet spectral esti- 
mators (NSE) compares very favorably to the best Pseudo-C^ estimators, over the whole multipole 
range. On simulations with a simple model (CMB + uncorrelated noise with known variance per 
pixel + mask), they perform uniformly better. Their distinctive ability to aggregate many different 
experiments, to control the propagation of errors and to produce a single wide-band error bars is 
highlighted. The needlet spectral estimator is a powerful, tunable tool which is very well suited to 
angular power spectrum estimation of spherical data such as incomplete and noisy CMB maps. 



Introduction 

The estimation of the temperature and polarization 
angular power spectra of the Cosmic Microwave Back- 
ground (CMB) is a key step for estimating the cosmolog- 
ical parameters. Cosmological information is encoded in 
the huge data sets (time order scanning data or high res- 
olution maps) provided by ground-based, balloon-borne 
or satellite experiments. 

In the ideal case of noiseless and full sky experiments, 
angular power spectrum estimation is a straightforward 
task. The empirical spectrum of the outcome of a Gaus- 
sian stationary field X, given by 

^=2ZTl E <^»^) 2 . (!) 

m=— I 

where (Yg m ) denote the usual spherical harmonics, also is 
the maximum likelihood estimator of the power spectrum 
of X. It is efficient in the sense that its variance reaches 
the Cramer-Rao lower bound. 

CMB maps are however more or less strongly contami- 
nated by foregrounds and instrumental noises, depending 
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on the wavelength, angular frequency £ and the direction 
of observation. Ground-based experiments cover small 
parts of the sky while space missions (COBE, W-MAP 
and, in the near future, Planck) provide full sky maps 
of the CMB, but still contaminated with galactic resid- 
uals. Then, the plain estimate is no longer efficient 
nor even unbiased. To circumvent the non-stationarity of 
actual observations, the main ingredients for the spectral 
estimation used by, for instance, the W-MAP collabora- 
tion T4j 7] and also in most other analysis, are broadly 
the following ones. Usually, some part of the covered 
sky is blanked to remove the most emissive foregrounds 
or the most noisy measurements. This amounts to ap- 
plying a mask or more generally a weight function to 
the sky. Most of the emissive foregrounds can be sub- 
tracted using some component separation procedure (see 
e.g. [18 for comparison methods with Planck-like simu- 
lated data). Even the best foreground-subtracting maps 
require masking a small fraction of the sky. Missing or 
masked data makes the optimal estimation of the power 
spectrum a much harder task. In particular, it breaks 
the diagonal structure of the covariance of the multipole 
moments a£ m := (X, Yg m ) of any stationary component. 
Maximum-likelihood estimation of the spectrum in the 
pixel domain has a numerical complexity that scales as 
7Vp ix and requires the storage of a 7Vp ix matrices. This is 
untractable for high resolution experiments such as W- 
MAP or Planck (7V pix ~ 13. 10 6 ). Nevertheless, for very 
low €s (£ < 30), ML estimation in the pixel domain can 
be performed on downgraded resolution maps; see [5j[30], 
for instance. At higher £'s, a sub-optimal method based 
on the Pseudo-Ci (PCL) gives quite satisfactory results 
in terms of complexity and accuracy [17 j. It debiases 
the empirical or (pseudo) spectrum from the noise con- 
tribution and deconvolves it from the average mask ef- 
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feet. It works in the spherical harmonic domain, uses fast 
spherical harmonic (SH) transforms and scales as A^V 2 . 
The available pixels can be weighted according to the sig- 
nal to noise ratio (SNR) at any given point. For signal- 
dominated frequencies (low Ts), the data are uniformly 
weighted; it yields the Pseudo-Q estimator with uniform 
weights (PCLU). At noise-dominated frequencies (high 
£'s), each pixel is weighted by the inverse of the variance 
of the noise (PCLW estimator). The W-MAP collabo- 
ration used uniform weights for £ < 500, the inverse of 
the noise variance for £ > 500 (see [I4j Section 7.5]) for 
its three-year release. Efstathiou (2004) showed that the 
PCLW estimator is statistically equivalent to the ML es- 
timator in the low SNR limit, which is usually the case 
at high €s. In the same paper he proposed an hybrid 
method with a smooth transition between the two PCL 
regimes. Finally, when several maps are available, it is 
worth considering cross-power spectra between different 
channels since noise is usually uncorrelated from channel 
to channel (see [T5l Al.l] or [25]). 

Other estimation procedures do not fit in any of the 
two categories above. Among them, the spectral estima- 
tion from time ordered data by [31] or Gibbs sampling 
and Monte Carlo Markov chain methods such as MAGIC 
or Commander, see [ID] . Those last methods try to es- 
timate the complete posterior joint probability distribu- 
tion of the power spectrum through sampling, which in 
turn can provide point estimates of the spectrum but also 
covariance estimates, etc. Recently, the multi-taper ap- 
proach has been imported from the time series literature 
to the field of spherical data by [6j |32] . The goal of this 
approach is to provide an estimation of a localized power 
spectrum, in a noiseless experiment. 

In this paper, we focus on spectral estimation of the 
global power spectrum, in a frequentist framework. We 
consider spectral estimation at small angular scales, i.e. 
in the range of multipoles where the cost of ML esti- 
mation is prohibitive. We compare our method to PCL 
methods. We adopt somehow realistic models that in- 
clude partial coverage of the sky, symmetric beam con- 
volution, inhomogeneous and uncorrelated additive pixel 
noise and multiple experiments. 

Localized analysis functions such as wavelets are nat- 
ural tools to tackle non-stationarity and missing data is- 
sues. There are different ways to define wavelets (in the 
broad sense of space-frequency objects) on the sphere, 
and our choice is to use the needlets, the statistical prop- 
erties of which have received a recent rigorous treatment 
(HI HI [3]) and which have already been applied success- 
fully to cosmology f [8l [T9l 124] ) . 

Needlets benefit from perfect (and freely adjustable) 
localization in the spherical harmonic domain, which en- 
ables their use for spectral estimation. Moreover, the 
correlation between needlet coefficients centered on two 
fixed directions of the sky vanishes as the scale goes to in- 
finity, i.e. as the needlet concentrate around those points. 
The spatial localization is excellent. This property leads 
to several convergence results and motivates procedures 



based on the approximation of decorrelation between co- 
efficients. In this contribution, we define and study a new 
angular power spectrum estimator that uses the prop- 
erty of localization of the wavelets in both spatial and 
frequency domains. 

In the case of a single experiment with partial coverage 
and inhomogeneous noise, the needlet-based estimator 
deals straightforwardly with the variation of noise level 
over the sky, taking advantage of their localization in the 
pixel domain. Moreover, it allows a joint spectral esti- 
mation from multiple experiments with different cover- 
ages, different beams and different noise levels. The pro- 
posed method mixes observations from all experiments 
with spatially varying weights to take into account the 
local noise levels. The resulting spectral estimator some- 
how mimics the maximum likelihood estimator based on 
all the experiments. 

The paper is organized as follows. In Section [I] we 
present the observation model and recall the basics of 
needlet analysis and the properties of the needlet coef- 
ficients which are the most relevant for spectral estima- 
tion. In Section [il] we define the needlet spectral estima- 
tors (NSE) in the single-map and multiple-maps frame- 
works. In Section [III] we present results of Monte Carlo 
experiments which demonstrate the effectiveness of our 
approach. In Section |IV[ we summarize the strong and 
weak points of our method and outline the remaining 
difficulties. 



I. FRAMEWORK 

A. Observation model 

Let T denote the temperature anisotropy of the CMB 
emission. For the sake of simplicity, we consider the fol- 
lowing observation model 

x{£k) = w{i k ) ((B * r)(&) + <T(&)z fc ) , 

fc = l,...,JV pix (2) 

where is a collection of pixels on the sphere, W de- 
notes a (O-l)-mask or any weight function < W < 1, 
B denotes the instrumental beam. An additive instru- 
mental noise is modelled by the term <j(^/ c )Z/ c with the 
assumption that (Zk) is an independent standard Gaus- 
sian sequence. Further, we assume that cr, W and B 
are known deterministic functions and that B is axisym- 
metric. Typically, the variance map a 2 writes cr 2 (^) = 
0"o/^obs(£fc)> where iV bs is referred to as the hit map, 
that is, AT bs(6e) is the number of times a pixel in direc- 
tion £fc is seen by the instrument. We assume that the ob- 
servations have been cleaned from foreground emissions 
or that those emissions are present but negligible out- 
side the masked region. When observations from several 
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experiments are jointly considered, the model becomes 

/c = l,...,7V pix , e = l,...,E (3) 

where e indexes the experiment. The CMB sky temper- 
ature T is the same for all experiments but the instru- 
mental characteristics (beam, coverage) differ (see for ex- 
ample Table |TT|) , and the respective noises can usually be 
considered as independent. 



B. Definition and implementation of a needlet 
analysis 

We recall here the construction and practical compu- 
tation of the needlet coefficients. Details can be found in 
Guilloux et al. (2007; see also Narcowich et al, 2006). 

Needlets are based in a decomposition of the spectral 
domain in bands or 'scales' which are traditionally index 
by an integer j. Let bP be a collection of window func- 
tions in the multipole domain, with maximal frequen- 



cies 



(see Figure [l] below). Consider some pixeliza- 



tion points £p\ k = 1, . . . , iV^, associated with positive 
weights \P , k 

integration (quadrature) for spherical harmonics up to 

(i) 

degree 2£max, that is, equality 



1, . . . , iV^ which enable exact discrete 



Is 



N 



U) 



k=l 



holds for any £, m such that £ < 2-^max, \m\ < £. Needlets 
are the axisymmetric functions defined by 



4%) 



4!L 



by>L e (t-e k j> ), 



(4) 



where denote the Legendre polynomial of order £ nor- 
malized according to the condition Li(l) = ^f^ 1 - For 

proper choices of window functions {bP}j, the family 

{ ipP | is a frame on the Hilbert space of Square- 
ly J k,j 

integrable functions on the sphere S. In a 5-adic scheme, 
it is even a tight frame [22]. Though redundant, tight 
frames are complete sets which have many properties 
reminiscent of orthonormal bases (see e.g. [7], chap. 3). 

For any field X on the sphere, the coefficients 7^ := 

(A^) _1 / 2 (X, ipP) are easily computed in the spherical 
harmonic domain as made explicit by the following dia- 
gram 



i X (€k)} k =l,...,N pix 

K'>L, „<, 



SHT 



4 X 



(5) 



Double arrows denotes as many operations (e.g. spheri- 
cal transforms) as bands. The initial resolution must be 
fine enough to allow an exact computation of the Spher- 

(i) 

ical Harmonics Transform up to degree Gax. If, say, the 
HEALPix pixelization is used, the nside parameter of 
the original map determines the highest available mul- 
tipole moments and, in turn, the highest available band 
3- 



C. Distribution of the needlet coefficients 

A square-integrable random process X on the sphere 
is said to be centered and stationary (or isotropic) if 
E(X(0) = 0, E(X(0 2 ) < 00 and E(X(£)X(£')) = 
(47r) _1 CiLi(£ • £')> w ^ n Ct referred to as the an- 
gular power spectrum of X. The next proposition sum- 
marizes the first and second order statistical properties 
of the needlet coefficients of such a process. They are 
the building blocks for any subsequent spectral analysis 
using needlets. 

Proposition 1 Suppose that X is a stationary and cen- 
tered random field with power spectrum Cg. Then the 
needlet coefficients are centered random variables and, 
for any 4-tuple (j,j',k,k') 



cov[ 7 i j) ,7^ } ] = 5>W ] CeL £ (cosO) (6) 

where 6 = 6(j,k,j f ,k f ) is the angular distance between 
zular 

hl j) ] = . (7) 



%P and £y ^ . In particular 



where 



C (j) (47r) -i J2 (b { p^ {21 + l)C t . 



(8) 



£>0 



In other words, the variance of the coefficients 7^ is the 
power spectrum of X properly integrated over the j-th 
band. 

Remark 1 It also follows from ^ that if the bands j 
and j' are non- overlapping (this is the case for any non- 
consecutive filters of Figure [71) ; all the pairs of needlet 

coefficients 7^ ' and 7^ ; are uncorrected and then inde- 
pendent if the field is moreover Gaussian. 



Suppose now that 



X{£ k ) = *(Z k )Z k , fc = l, 



pix? 



is a collection of independent random variables with zero 
mean and variance <r 2 (£/c), where a is a band-limited 
function. This is a convenient and widely used model 
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for residual instrumental noise (uncorrelated, but non- 
stationary). Needlet coefficients are centered and, more- 
over, if the quadrature weights are approximately uni- 
form (Afc c± 47r/7Vpi x , as is the case of HEALPix) and a 
is sufficiently smooth, then 



We denote 



1/2 



(9) 



the standard deviation of the needlet coefficient of scale 
j centered on When the noise is homogeneous (a is 



constant), it reduces to X^>o(^ + 1) (b[ 



D. Mask and beam effects 

As already noticed in the Introduction, missing or 
masked data makes the angular power spectrum esti- 
mation a non trivial task. Simple operations in Fourier 
space such as debeaming become tricky. Needlets are 
also affected by both the mask and the beam. The effect 
on needlets of beam and mask can be approximated as 
described below. These approximations, which lead to 
simple implementations, are validated in numerical sim- 
ulations in relatively realistic conditions in Section [TTTj 

a. Mask. Recalling that the needlets are spatially 
localized, the needlet coefficients are expected to be in- 
sensitive to the application of a mask on the data if they 
are computed far away from its edges. Numerical and 
theoretical studies of this property can be found in [2] 
and [13]. In practice, we choose to quantify the effect of 
the mask on a single coefficient 7^ by the loss induced 

on the L 2 -norm of the needlet ^\ i-e. a purely geomet- 
rical criterion. More specifically, needlet coefficients at 
scale j are deemed reliable (at level t^) if they belong 
to the set 



JC U) - 



k = 1, 



Hi 



> fk') } (10) 



Parameter is typically set to 0.99 or 0.95 for all bands. 
Note that i-> jc[ j) is decreasing, JC ( j) = and 

/C^+ = 0. In practice, this set is computed by threshold- 
ing the map obtained by the convolution of the mask with 

the axisymmetric kernel £ 1— > (E^Mcosfl)) . This 

operation is easy to implement in the multipole domain. 

b. Beam. Consider now the effect of the instrumen- 
tal beam. Its transfer function Bg is assumed smooth 
enough that it can be approximated in the band j by its 
mean value B^) in this band, defined by 

{B^f := (47T)" 1 £(2£+ ) 2 B|. (11) 

£>0 



In the following, the beam effect for spectral estimation 
is taken into account in each band. Indeed, with Defini- 



tion (11), with no noise, no mask and a smooth beam, 
Eq. ([7]) translates to 



var 



B(J) 



CM ,k = l,...,Ni 



U) 



(12) 



In other words, thanks to the relative narrowness of 
the bands and to the smoothness of the beam and 
CMB spectrum, the attenuation induced by the beam 
can be approximated as acting uniformly in each band 
and not on individual multipoles. Numerically, with 
typical beam values from WMAP or ACBAR experi- 
ments (see Table [n|, the relative difference (statistical 
bias) between the goal quantity and the estimated 
one var(7^)/(£ (j) ) 2 remains under 1% for bands be- 
low j = 27 (£ max = 875) for WMAP-W, and below 
j = 39,Cax = 2000 for ACBAR. 



II. THE NEEDLET SPECTRAL ESTIMATORS 
(NSE) 

A. Smooth spectral estimates from a single map 

For any sequence of weights such that 



(j) 



J2k=i = 1 and for a clean (contamination- free), 

complete (full-sky) and non-convolved (beam-free) ob- 
servation of the CMB, the quantity 



N 



U) 



fc=l 



is an unbiased estimate of a direct consequence of 
Proposition [I] 

Remark 2 For uniform weights, this estimator is noth- 
ing but the estimator Cg from Eq. binned by the win- 
dow function (bP) 2 . Indeed, (see diagram ^) 



C«)=(4 7 r)- 1 ^(6f) 2 £ «L (13) 

£>0 m=-£ 

=(4^)- 1 ^(^' ) ) 2 (2£ + l)Q (14) 



£>0 



This is the uniformly minimum variance unbiased es- 
timator of The so-called cosmic variance is the 
Cramer-Rao lower bound for estimation of the parameter 
Ci in the full-sky, noise-free case. Its expression simply 
is 2C 2 /(2£ + 1). Its counterpart for the binned estimator 
in this ideal context is 



^Lc = 2(47r)- 2 ^(^ ) ) 4 (2^+l)C| 



(15) 
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Consider now the observation model Up to the ap- 
proximations of Section [TdI one finds that 



E 



( BU) Y ^ 



y-wy 



is an unbiased estimate of as soon as 



E 



i. 



(16) 



(17) 



keK, 



The weights can further be chosen to minimize the mean- 
square error E — C^^j , under the constraint (17). 

It amounts to setting the weights according to theTo- 
cal signal-to-noise ratio, which is non constant for non 
stationary noise. This is a distinctive advantage of our 
method that it allows for such a weighting in a straight- 
forward and natural manner. In the case of uncorrelated 
coefficients, this optimization problem is easy to solve 
(using Lagrange multipliers) and is equivalent to maxi- 
mizing the likelihood under the approximation of inde- 
pendent coefficients (see Appendix[C|for details). It leads 
to the solution 



=M^)T [ e p+^y 



(18) 



k'eJC\- 



with C = This is the unknown quantity to be 

estimated but it can be replaced by some preliminary es- 
timate (for example the spectral estimate of [14]). One 
can also iterate the estimation procedure from any start- 
ing point. The robustness of this method with respect 
to the prior spectrum is demonstrated at Section [ill A 2| 
(see Figure [6]). 

Those weights are derived under the simplifying as- 
sumption of independence of needlet coefficients. They 
can be used in practice because needlet coefficients are 
only weakly dependent. Precisely, for two fixed points 
on a increasingly fine grid , and well-chosen window 
functions, the needlets coefficients (7^,7^) are asymp- 
totically independent as j — > 00 (see [2]). Note that this 
property is shared by well-known Mexican Hat wavelets, 
as proved in [21]. 



B. Smooth spectral estimates from multiple 
experiments 

Consider now the observation model described by 
Eq. ([3]) with noise independent bet ween experiments. Us- 
ing the approximations of Section I D , Eq. ([3| translates 
to 



7* W) CT) + 



n 



CO 



(cr e 



ZkA 



(19) 



in the needlet domain, for indexes k G /C^. , where Z^ e 
are standard Gaussian random variables which are cor- 
related within the same experiment e but independent 
between experiments. As explained in the single experi- 
ment case of Section al A[ the coefficients are only slightly 
correlated. This justifies the use, in the angular power 
spectrum estimator, of the weights derived by the max- 
imization of the likelihood with independent variables. 
The correlation between coefficients does not introduce 
any bias here but only causes loss of efficiency. As in the 
single-experiment case, only the coefficients sufficiently 
far away from the mask of at least one experiment are 
kept. Defining 

/C« = U e /C«. 

the aggregated estimator is implicitly defined (see Ap- 
pendix [Cl) by 



with, for any k in /C^ 



= E< 



(j) lk,e 



B, 



k,e ' 




(20) 
(21) 
(22) 
(23) 



and similarly to (18) 



c 



E 

fe'e/co) 



C 



(24) 



Note that £ e 



°k,e 



1 and Y,k ™^ 



1. An explicit es- 



timator is obtained by plugging some previous, possibly 

rough, estimate C ij) of in place of C of Eq. ([24). 
Eventually, the aggregated angular power spectrum esti- 
mator is taken as 



C<3) 



E 



(25) 



This expression can be interpreted in the following way. 
For any pixel k in that is for any pixel where the 

needlet coefficient is reasonably uncontaminated by the 
mask for at least one experiment, compute an aggregated 



needlet coefficient 7^ by the convex combination (|21|) of 
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the debeamed needlet coefficients from all available ex- 
periments. Weights of the combination are computed 
according to the relative local signal to noise ratio (in- 
cluding the beam attenuation). Finally, a spectral es- 
timation is performed on the single map of aggregated 

coefficients, in the same way as in Section III A| Those co- 

( i 

efficients are squared and translated by ftp to provide an 
unbiased estimate of Then all the available squared 
and debiased coefficients are linearly combined according 
to their relative reliability w^\C) which is proportional 



to (CW + (4 J) ) 2 ) _1 - Figures 



13 



and 
and uj 



ues of those weights (maps) w 

mixing of experiments. See Section [III B| for details. 



display the val- 
k e for a particular 



Parameters of the method 



(1960) for the PSWF in R and [El EE] for PSWF on the 
sphere. In our simulations, we use PSWF needlet s since 
they are well localized and easy to compute. Other cri- 
teria and needlets can be investigated and optimized, at 
least numerically; see [13 for details. The choice of the 
optimal window function in a given band is a non trivial 
problem which involves the spectrum itself, the charac- 
teristics of the noise and the geometry of the mask. Even 
if we restrict to PSWF as we do here, it is not clear how to 
choose the optimal opening 6^ for each band j. We can 
use several rules of thumb based on approximate scaling 
relation between roughly £?-adic bands and openings 0^ 
that preserve some Heisenberg product or Shannon num- 
ber. Figure [I] represents three families of PSWF needlets 
that are numerically compared below. Their spatial con- 
centration is illustrated by Figure [2j 



In this section, we discuss various issues raised by the 
choice of the parameters of the NSE method. Those pa- 
rameters are: the shape of the spectral window func- 
tion b^p in each band (or equivalently the shape of the 
needlet itself in the spatial domain), the bands them- 
selves {i.e. the spectral support of each needlet) and the 
values of the thresholds tj that define the regions of the 
sky where needlets coeffic ients ar e trusted in each band; 
see Eq. (10). See Section 
gation. 



Ill A 2 



for a numerical investi- 



1. Width and shape of the window functions 

For spectral estimation, it is advisable to consider spec- 
tral window functions with relatively narrow spectral 
support, in order to reduce bias in the spectral estima- 
tion. The span of the summation in Q can be fixed to 

some interval [-^ 5 ^mL 

]. For our illustrations, the inter- 
val bands have been chosen to cover the range of available 
multipoles with more bands around the expected posi- 
tions of the peaks of the CMB. The bands are described 
in Table H 

It is well known however that perfect spectral and spa- 
tial localization cannot be achieved simultaneously (call 
it the uncertainty principle). In order to reduce the effect 
of the mask, we have to check that the analysis kernels 
are well localized. This leads to the optimization of some 
localization criteria. If we retain the best L 2 concentra- 
tion in a polar cap n eU ) = {£:#< 0^)}, namely 



I, 



arg max - 

b 



(26) 



we obtain the analogous of prolate spheroidal wave func- 
tion (PSWF) thoroughly studied in e.g. Slepian & Pollak 



2. The choice of the needlets coefficients (mask) 

Practically we want to keep as much information (i.e. 
as many needlet coefficients) as possible, and to minimize 
the effect of the mask. Using all the needlet coefficients 
regardless of the mask would lead to a biased estimate 
of the spectrum. It is still true if we keep all the co- 
efficients outside but still close to the mask, keeping in 
mind that the needlets are not perfectly localized. On 
the other hand, getting rid of unreliable coefficients re- 
duces the bias, but increases the variance. This classical 
trade-off is taken by choosing the threshold level in 
the Definition (10) of the excluding zones. For multiple 



experiments, a different selection rule can be applied to 
each experiment, according to the geometry of the mask 
and the characteristics of the beam and the noise. 



III. MONTE CARLO STUDIES 

Recall that NSE spectral estimators are designed based 
on three approximations: 

• one can neglect the impact of the mask on the 
needlet coefficients which are centered far enough 
from its edges; 

• one can neglect the variations of the beam and the 
CMB power spectrum over each band. 

• the weights, which are optimal under the simplify- 
ing assumption of independent needlet coefficients, 
still provide good estimates for the truly weakly 
correlated needlet coefficients. 

We carry out Monte Carlo studies to investigate, first 
the actual performance of the method on realistic data, 
and second the sensitivity of the method with respect to 
its parameters. Stochastic convergence results under ap- 
propriate conditions is established in a companion paper 

EE- 
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FIG. 1: Four families of window functions that are used for the NSE and compared numerically in Section [HI A 2| There are 
three families of prolate spheroidal wave functions and one family of top-hat functions. All the families are defined on the 
same bands. Inset graphs show the window function in the 26th band. Each window function is normalized by the relation 
(47TT 1 J2( b £ j) ) 2 ( 2i + 1) = 1. Then, if the angular power spectrum is flat, Cg = Co, then = Co for all bands, according 
to 




FIG. 2: Angular profile of the four needlets associated to the window functions at the 26th band (701 < £ < 800) from the four 
families of Figure [I] We have plotted the axisymmetric profile ^ g hgLt (cos 9) as a function of 0. Needlets with "smoother" 
associated window profile (such as Prolate 3) need more room to get well localized, but are less bouncing than needlets with 
abrupt window function (such as top hat or Prolate 1) 
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Band (j) 


1 


2 


3 


4 


5 




20 


21 


22 


23 


24 


25 


26 


27 


28 


29 




35 


36 


37 


38 


39 


f 

min 

-'-max 


2 

20 


11 

30 


21 
40 


31 
50 


41 
60 




401 
500 


451 
550 


501 
600 


551 
650 


601 
700 


651 
750 


701 
800 


751 
875 


801 
950 


876 
1025 




1326 
1475 


1426 
1625 


1501 
1750 


1626 
1875 


1751 
2000 


nside^ 


16 


16 


32 


32 


32 




256 


512 


512 


512 


512 


512 


512 


512 


512 


1024 




1024 


1024 


1024 


1024 


1024 


nU) 


69 


50 


41 


36 


32 




10.7 


10.2 


9.7 


9.3 


8.9 


8.6 


8.3 


8.0 


7.7 


7.4 




6.1 


5.8 


5.6 


5.4 


5.2 



TABLE I: Spectral bands used for the needlet decomposition in this analysis. Depending on ^mL, the needlet coefficient maps 
are computed using the HEALPix package, at different values of nside, given in the fourth line. The number of needlet 
coefficients in band j is then 12(nside^) 2 . It is a ki nd of decimated implementation of the needlet transform. T he last l ine 
gives the opening 6^ (in degrees) chosen in Eq. 26 



details). 



to define the PSWF from the Prolate 2 family (see Section 



III A 2 



for 



A. Single map with a mask and inhomogeneous 
noise 

In this section, we first consider model ([2|. According 
to Eqs ( 12 ) and ( 16 ), any beam can be taken into account 
easily in the procedure. Without loss of generality, we 
suppose here that there is no beam (or B is the Dirac 
function). The case of different beams is addressed in 
Section [ilTBl see Table [ill 



The key elements for this numerical experiment are il- 
lustrated by Fig. [3] We simulate CMB from the spectrum 
Ci given by the ACDM model that best fits the W-MAP 
data. We use a KpO cut [4] for the mask and we take a 
simple non homogeneous noise standard deviation map 
(the SNR per pixel is 1.5 in two small circular patches 
and 0.4 elsewhere). 

Using a mean-square error criterion, we first study the 
dependence of NSE performance on its free parameters. 
Then we compare NSE with methods based on Spheri- 
cal Harmonic coefficients, known as pseudo-C^ estimation 
and followed in Hinshaw et al. (2006)). For the reader's 
convenience, the PCL procedure is summarized in Ap- 
pendix [Al 



intrinsic to the whole experiment. When the sky is par- 
tially observed (let / s k y denote the fraction of available 
sky) and for high ts (or j's), the cosmic variance must 
be divided by a factor / s k y leading to the following ap- 
proximate Cramer-Rao lower bound at high frequencies 



v U) , =n 1 v U) . . 

sample J sky cosmic 



(27) 



Including an homogeneous additive uncorrelated pixel 
noise with variance a 2 , the sample variance writes 



47T 



In a non-homogeneous context, no close expression for 
the sampling variance is available: Eq. (27) will serve as 
one reference. When comparing different window func- 
tions in the same band, it must be kept in mind that dif- 
ferent estimators do not estimate the same so that 
the sampling variances are not the same. In this case, we 
use the following normalized MSE 



MSE(C^) 

J skv c 



0) 



(28) 



1. Mean-square error 

We shall measure the quality of any estimator C^of 
by its mean-square error 

MSE(C W) ) = E (d {j) - C (j) ) 2 . 

This expectation is estimated using 400 Monte Carlo 
replications. Roughly speaking, the MSE decomposes 
as an average estimation error and a sampling variance. 
The estimation error term is intrinsic to the method. Ide- 
ally, it should be used to compare the relative efficiency 
of concurrent approaches. The sampling variance term 
is the so-called cosmic variance. It is given by the char- 
acteristic of the spectrum and coming from the fact that 
we only have one CMB sky, and thus 21 + 1 a^ m 's to 
estimate one Cg. It is increased by the negative influ- 
ence of the noise and the mask. This gives an error term 



2. Robustness with respect to parameter choice 

This section looks into the robustness of NSE with re- 
spect to its free parameters. 

First and as expected, the spectral estimation is very 
sensitive to the choice of the window functions. Even if 
we restrict to the PSWF, one has the freedom to choose 
a concentration radius 0^ for each band. We com- 
pare the mean-square error of the estimation for various 
choices of 0^ that lead to three of the window func- 
tion families displayed in Figure [I] The second prolate 
family is obtained using the "rule of thumb" relation 
° U) = 2((^ m 'in + ^mL)/2)- 1 / 2 . The values of those open- 
ing angles are in Table [I] The first and third sequences of 
opening angles are the same with a multiplicative factor 
of 0.5 and 2, respectively. For the sake of comparison 
we also consider the top-hat window functions. Figure [4] 
shows the normalized MSE for those four "families" of 
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(a)Mask (b)Noise standard deviation (c)One simulated input map (d)Spectra 

FIG. 3: Simplified model of partially covered sky and inhomogeneous additive noise. This model is used to compare numerically 
the NSE estimator with PCL estimators and to assess the robustness or the sensitivity of the method with respect to its 
parameters. The mask is kpO. In CMB \iK units, the standard deviation of the uncorrelated pixel noise is 75 in the two small 
circular patches and 300 elsewhere. 
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FIG. 4: Comparison of the normalized MSE (28) of the 
needlet spectral estimators for the four families of spectral 
window functions displayed in Figure [I] The smoothness of 
the window function make the MSE smaller at high multi- 
poles. At low multipoles, taking a too smooth function makes 
the needlet less localized and there is a loss of variance due 
to the smaller number JC^ of needlet coefficients that are 
combined. 



needlets as a function of the band index. Notice the poor 
behavior of a non-optimized window function and the far 
better performance of the second prolate family in com- 
parison with top-hat and Prolate 1 windows. Thus, in the 
following, we use this particular needlet family to study 
the sensitivity of the method with respect to the other 
parameters, and to compare NSE and PCL estimators. 

Next, for the second family of PSWF, we compare the 
influence of the threshold value = t for t = 0.9, 0.95 
or 0.98. Figure [5] shows that this choice within reasonable 
values is not decisive in the results of the estimation pro- 
cedure. For very low frequencies (£ < 100), the conser- 
vative choice t = 0.98 increases the variance since many 
needlets are contaminated by the mask and discarded. 
Qualitatively, in such variance dominated regimes, taking 
more coefficients (e.g. t = 0.9) is adequate. However, we 
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FIG. 5: Relative difference between the normalized MSE (28) 
of the needlet spectral estimation using thresholds 0.9 and 
0.98, and the same with threshold 0.95. It highlights the fact 
that the estimation is not very sensitive to the value of this 
parameter, except at low -f s, where we do not advocate the 
use of the NSE. The window function family is "Prolate 2" 
from Figure [I] 



do not advocate the use of the NSE at low f s where ex- 
act maximum likelihood estimation is doable. At higher 
Fs, there is roughly no difference between the t = 0.95 
and t = 0.98 thresholds. 



Finally, we check the robustness of the method against 
an imprecise initial spectrum. We take 0.9C^ and 
l.lC^ as initial values and compare the results 

with the best possible initial value which is itself. 
The relative difference between the results, displayed in 
Figure |6j does not exceed 1%. 
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FIG. 6: Robustness of the NSE with respect to the ini- 
tial value given to the weights formula (18). We have 



performed the whole estimation with — 0.9C^ and 

qU) — 1.1(7^). This plot shows the relative difference be- 
tween the normalized MSE under those initial values and the 
normalized MSE under the optimal initial value = 



Pseudo-d versus needlet spectral estimator 



We compare the NSE estimator given by Eq. ( 16 ) with 
estimation based on the spherical Harmonic coefficients 
of the uniformly weighted map and of the l/a 2 -weighted 
map. The result is displayed in Fig. [7| 

As expected, at low multipoles, where the SNR is 
higher, the uniform pseudo-C^ estimator performs better 
than the weighted pseudo-C^ estimator and, conversely, 
at high multipoles where the SNR is lower. [9] proved 
that the equal-weights pseudo-C^ estimator is asymptot- 
ically Fisher-efficient when £ goes to infinity. The be- 
haviour of the needlet estimator is excellent: its per- 
formance is comparable to the best of the two previous 
methods both at low and high multipoles. Thus, there 
is no need to choose arbitrary boundaries between fre- 
quencies for switching for one weighting to the other. 
The NSE estimator automatically implements a smooth 
transition between the two regimes and it does so quasi- 
optimally according to noise and mask characteristics. 
At low €s one should optimize the window function (the 
characteristic angle of opening of the prolates) to broaden 
the range of optimality of the NSE. 

Providing a Ct estimate with error bars is often not suf- 
ficient. Estimate the covariance matrix of the whole vec- 
tor of spectral estimates is necessary for full error prop- 
agation towards, say, estimates of cosmological parame- 
ters. Figure [8] shows the values of the correlation matrix 
between the spectral estimates. In the idealistic case of 
a full sky noiseless experiment, the theoretical correla- 
tion matrix is tridiagonal because window functions we 



have chosen only overlap with their left and right nearest 
neighbours. The mask induces a spectral leakage, which 
is however reduced for the smoothest window function. 
This leakage is however compensated for by the selection 



of coefficients in /C^ (see Eq (|10|)). 



B. Aggregation of multiple experiments 

Historically in CMB anisotropy observations, no single 
instrument provides the best measurement everywhere on 
the sky, and for all possible scales. In the early 90's, the 
largest scales have been observed first by COBE-DMR, 
complemented by many ground-based and balloon-borne 
measurements at higher i. Similarly, ten years later, 
WMAP full sky observations on large and intermediate 
scales have been complemented by small scale, local ob- 
servations of the sky as those of Boomerang, Maxima, 
ACBAR or VSA. 

The joint exploitation of such observations has been so 
far very basic. The best power spectrum is obtained by 
choosing, for each scale, the best measurement available, 
and discarding the others. One could, alternatively, av- 
erage the measurements in some way, but the handling of 
errors is complicated in cases where a fraction of the sky 
is observed in common by more than one experiment. 

Clearly, the data is best used if some method is devised 
that allows combining such complementary observations 
in an optimal way. In this section we present the results 
of a Monte Carlo study to illustrate the benefits of our 
method of aggregated spectral estimator. 

We simulate observations following the model (3|, with 
E = 6 observed maps : 3 KpO-masked maps with beams 
and noise-level maps according to W-MAP experiment 
in bands Q, V and W respectively ; 3 maps with uniform 
noise, observed in patches the size of which are equiva- 
lent to BOOMERanG-Shallow, BOOMERanG-Deep and 
ACBAR observations respectively, and noise levels rep- 
resentative of the sensitivities of those experiments. Ta- 
ble [H] gives the key features of these experiments. Further 
details can be found in g] and [16] for W-MAP, [20] for 
BOOMERanG, [27] and [26] for ACBAR. However, we 
do not intend to produce fully-realistic simulations. Basi- 
cally, no foregrounds are included in simulations (neither 
diffuse nor point sources) ; for ACBAR only the 3 sky 
fields of year 2002 are used ; and for W-MAP only one 
detector is used for each band. 

Key elements for this numerical experiment are illus- 
trated in Fig. 9JT0JTT] where we have displayed respec- 
tively one random outcome of each experiment according 
to those simplified models, the maps of local noise levels 
and power spectra of the CMB and of the experiment's 
noise. 

Fig. 13 displays the maps of the weig htS 4^ ( the 
26th band is the multipole range 700 < t < 800). Ac- 
cording to Eq. (23), all those weights belong to [0,1] and 
for any fixed position, the sum of the weights over the 
six experiments is equal to one. Red regions indicates 
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FIG. 7: Comparison of the relative MSE (28) of the two PCL estimators (PCLU for flat weights, PCLW for inverse variance 
weights) with the NSE, for Prolate family 2. For 300 < £ < 1200, the NSE is uniformly better than the best of the two PCL 
methods. It should be noted that the NSE may be improved again by optimizing the window profiles and the thresholds t 
(e.g. by taking a lower threshold for low bands to reduce the variance, see Figure [5]). 
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FIG. 8: Absolute value of the correlation matrix of vector (C^\ . . . , C^ 32 ^), which entries are defined by Eq. (16). It has been 
estimated in the context of Fig. [3] using two different families of window functions and 400 Monte Carlo replicates. This shows 
the difference between a family of PSWF (left panel) and a family of top-hat windows (right panel). 



needlet coefficients which are far better observed in an 
experiment that in all others. Blue, light blue and or- 
ange region are increasing but moderately low weights, 
showing that outside the small patches of BOOMERanG 
and ACBAR, most information on band 26 is provided 
by the channel W of WMAP. On the patches, needlet co- 
efficients from W-MAP are numerically neglected in the 
combination (21 ). 

The debiased, squared, aggregated coefficients 



are displayed on the left map 



from Fig. 14 All those coefficients are approximately 
unbiased estimators of C^ 26 \ The map of weights w^ 1 ^ 

More weight 
by lower noise 



is displayed on the right of Figure [14 
is given to regions which are covered 
experiments. The final estimate is obtained by averaging 



the pixelwise multiplication of these two maps. 

Figure [15] shows the benefit of the aggregation of dif- 
ferent experiments, in comparison with separate estima- 
tions. In CMB literature, error bars from different exper- 
iments are usually plotted on a same graph with different 
colors. For easier reading, we plot the output of single 
experiment NSE in separate panels (a,b and c). Panel (d) 
shows the output of the aggregated NSE, which improve 
the best single experiment uniformly over the frequency 
range, thanks to the locally adaptive combination of in- 
formations from all expermiments. 



Figure 12 highlights the cross-correlation between sin- 
gle experiment estimators and the final aggregated esti- 
mator. It provides a complementary insight on the rel- 
ative weight of each experiment in the spectral domain. 
The W-MAP-like measures are decisive for lower bands, 
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(a) "True" CMB 



(b)WMAP-Q 



(c)WMAP-V 



(d)WMAP-W 




(e)BOOMERanG-S 



(f)BOOMERanG-D 



(g)ACBAR 




FIG. 9: Simulated observations from model |3| for the 6 experiments described in Table |IT} in a small patch around point 
(-40,-90). The approximate size of the patch is 38x38 degrees. 




i 



(a)WMAP-Q 



(b)WMAP-V 



(c)WMAP-W 




(d)BOOMERanG-S (e)BOOMERanG-D (f)ACBAR 

FIG. 10: Coverage and local pixel noise levels of the six simple experiments described in Table |ll| 



whereas BOOMERanG and ACBAR ones give estima- 
tors very much correlated to the aggregated one at higher 
bands. The aggregated NSE is eventually almost identi- 
cal to the estimator obtained from ACBAR alone. 



IV. DISCUSSION 



Complexity 



According to ([5|, the calculation of all the needlet co- 
efficients takes one SHT and j m ax inverse SHT, where 



k.e 



is the number of bands. The weights w 



U) ? r,(i) 
Mi 



and 



are obtained using simple operations on maps, so 
that the overall cost of the (aggregated) NSE scales as 

3 /2 

N & operations. This is comparable to the cost of the 
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Experiment 


Beam 


nside 


Noise level 


/sky 


W-MAP Q 


31' 


512 


Cjriven by tne 
hit map 


78.57% 


W-MAP V 


21' 


W-MAP W 


13' 


BOOM S 


10' 


1024 


17.5 fiK 


2.80% 


BOOM D 


5.2 fiK 


0.65% 


ACBAR 


5' 


2048 a 


14.5 iiK 


1.62% 



a We used nside=1024 for our Monte Carlo simulations, as going 
to Vax — 2000 is enough to discuss all the features of our method. 

TABLE II: Main parameters of the experiments to be aggre- 
gated. The beams are given in minutes of arc, nside refers to 
HEALPix resolution of the simulated maps, noise level is ei- 
ther a map computed from a hitmap and an overall noise level, 
or a uniform noise level per pixel (in fiK CMB). Numbers 
quoted here are indicative of the typical characteristics of ob- 
servations as those of W-MAP, BOOMERanG and ACBAR, 
and are used for illustrative purposes only. 




Multipole 

FIG. 11: Spectra of the beamed CMB (with the 
BOOMERanG lines overplotted) and noise levels (horizon- 
tal lines) seen by the six experiments, as if they were full sky 
(the /sky effect is not taken into account). 
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1000 
Multipole 



1500 



FIG. 12: Correlation between the aggregated estimator and 
single experiments estimators. This provides insight on the 
contribution of each experiment into the final aggregated sin- 
gle spectral estimate. 



Thus, an unbiased spectral estimator is given by 

C<jL = £ «P £ _1 7*27$ (29) 



(i) 

where the weights w y k ' depend on a preliminary estimate 
of the spectrum and a possibly imprecise estimate of the 
local and aggregated noise levels that enter in the vari- 
ance of 7^7^/- This has not been investigated numer- 
ically yet but we can conjecture the qualitative results 
of this approach: more robustness with respect to noise 
misspecification but greater error bars than the NSE with 
perfectly known noise levels. Moreover, adapting the 
procedure described in [25], one can test for noise mis- 
specification, and for the correct removal of the noise 
by considering the difference between the NSE and 



cross-spectrum NSE C< 



CO 



PCL methods. 



CONCLUSION 



Sensitivity to the noise knowledge 

To be unbiased, the above described estimators require 
a perfect knowledge of the noise characteristics, as do 
Pseudo-Q estimators. In both cases, the uncertainty 
on the noise can be tackled using cross-spectrum, that 
removes the noise on the average provided that the noises 
from each experiment are independent. Indeed, for any 
pixel k far enough from the masks of experiment e and 
e', e e' ', we have 

E[ 7 ^ e ,]=B^B^C^. 



We have presented some potentialities of the needlets 
on the sphere for the angular power spectrum estima- 
tion. This tool is versatile and allows to treat consis- 
tently the estimation from a single map or from multiple 
maps. There remains many ways of improving or modify 
the method described in Section ITT1 

In the future, it is likely that again complementary 
data sets will co-exist. This is the case, in particular, 
for polarisation, for which Planck will measure the large 
scale CMB power on large scales with moderate sensi- 
tivity, while ground-based experiments will measure very 
accurately polarisation on smaller scales. Extensions to 



14 





FIG. 13: Method for aggregating experiments: Weights uj\, J for combining the needlet coefficients from the 26th band 
(700 < £ < 800) and the six experiments. From left to right and top to bottom: W-MAP-Q, W-MAP-V, W-MAP-W, 
BOOMERanG-S, BOOMERang-D and ACBAR. 




FIG. 14: Method for aggregating experiments: On the left: map of debiased squares of aggregated needlet coefficients, in the 
26th band (700 < £ < 800). On the right: map of the weights affected to those coefficients to estimate the power spectrum. 



polarisation of the approach presented hers will likely be 
important for the best exploitation of such observations. 
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APPENDIX A: PSEUDO-Q ESTIMATORS 



Let T be a stationary process with power spectrum (C^)^>o, W an arbitrary weight function (or mask) and 

m=—£ 

the so-called pseudo-power spectrum of T with mask W. The ensemble- average of this quantity is related to the true 
power spectrum by the formula 

E(Ci) = Mw(W)Ci> 

where Mw(W) is the doubly-infinite coupling matrix associated with W, see [TTJ [23] . If U is a unit variance white 
pixel noise, denote by Vg = Ana 2 /N p [ x its "spectrum" (see Appendix |b|) . Consider now the model X = WiT + W2U. 



15 



iooo S 



true 
SpecFusion t- 





1000 

(a)ACBAR 



500 1000 1500 

(b)BOOMERanG 





true 
SpecFusion h 



500 1000 

(c)WMAP 



500 1000 1500 

(d)All aggregated 



FIG. 15: Results for the aggregated NSE. Error bars are estimated by 100 Monte Carlo simulations. The ACBAR power 
spectrum is computed using the single map needlet estimator described in Section [II A| whereas the BOOMERanG and W- 
MAP spectra are obtained using the aggregation of needlets coefficients from the two (BOOM-S and BOOM-D) and three 
(W-MAP Q,V,W) maps respectively. The final spectrum (d) is obtained by aggregating all available needlet coefficients from 
the six maps. 



Then, if Mm>(Wi) is full-rank, 



(AMWi))- 1 {Q(Wi) - M(W 2 )u>Vi>} 



is an unbiased estimator of Cg< . It is obtained by deconvolving and debiasing the empirical spectrum. The observation 
model Q with no beam coincides with the preceding framework with Wi = W and W2 = crW . This leads to the 
uniform- weights pseudo-Cg estimator (PCLU). One can also divide all the observations by a 2 , yielding to a similar 
scheme with Wi = cr~ 2 W and W2 = cr~ 1 W. This is the variance- weighted pseudo-Ci estimator (PCLW). Both 
are used by the W-MAP collaboration [13] . The uniform weights lead to better estimates in the high SNR regime 
(low ts) whereas the flat weights perform better at low SNR (high ts). [9 showed that the Pseudo-Q estimator is 
statistically equivalent to the maximum likelihood estimator asymptotically as t goes to infinity. He also proposed an 
implementation of a smooth transition between those two regimes. 
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FIG. 16: Mean-square error of the three single expermiment NSE estimators and of the aggregated NSE estimator. The 
2-sigmas error bars reflect the imprecision in the Monte Carlo estimation of the MSE of the aggregated NSE. Up to those 
uncertainties, the aggregated estimator is uniformly better than the best of all experiments. The improvement is decisive in 
"crossing" regions, where two expermiments perform comparably. The normalized MSE here is E(C^ — C^) 2 /(C^) 2 . 



APPENDIX B: WHAT "NOISE SPECTRUM" MEANS 



Let v denote the noise. It is defined on pixels and supposed centered, Gaussian, independent from pixel to pixel, 
and of variance <r 2 (£), i.e. 

Vk = cr(£k)U k , k = 1, . . . iVpix, 

with E/i, . . . , /7jv pix u ~ A/*(0, 1). Define V£ m := \kVkYim(tik) -> and call them (abusively) the "discretized" multipole 
moments of the noise, which do not have any continuous counterpart because v is not defined on the whole sphere. 
Define the corresponding discretized empirical spectrum N% := J2m v \m^ ^ nen 

k 

Ni = - ^\k^k>cr(£k)cr(£k')UkUk>Li((£k^k>}) 

k,k' 

and E(N e ) = ±- £ \ 2 k a 2 ^ k ) =: N e 

k 

This sequence Ng can be thought of as the pixel- noise spectrum. Note that if = k — l,...,iVpi x , then 

Ni = -j^— j <r 2 (£)d£. If the noise is moreover homogeneous, <r(£) = a, then E(z^ m z^/ m /) = jf^r5ifl8 m ^m' . 



APPENDIX C: VARIANCE ESTIMATION BY AGGREGATION OF EXPERIMENTS WITH 

INDEPENDENT HETEROSCEDASTIC NOISE 

Consider the model 

Yk,e = Xk + Tlk,eZk,e 

where X := [X k }ke[i,N pix ] and Z := [Z kl e](k,e)e[i,N pix ]x[i,E\ are independent, X k ~ d A/"(0,C), Z fe>e ~ d A/"(0, 1) and the 
noise standard deviations n^e are known. This corresponds to the observation of the same signal X by E independent 
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experiments, the observations being tainted by independent but heteroscedastic errors. Let Y& := [¥fc, e ](ee[i,#] the 
vector of observations at point (or index in a general framework) fc, and let Y := (\Yk]ke[i,N pix ]) T be the full vector 
of observations. The covariance matrix of Y& is := 11 T C + where := diag(n| e ) e e[i,E] an d 1 is the E x 1 
vector of ones. By independence of the Y^'s, the negative log-likelihood of C given Y thus writes 

£(C):=-21og(P(Y|C)) = -2 log (P(Y k \C)) 

= ^Y^R^Yfc + logdetRfc. 

Denote 

h k := (l r N fe -4)- 1/2 = (E e Ke)" 2 )" 1/2 . (CI) 
It is immediate to check the following identity which will be used below: 

Define R/e := YfcYfc T . The derivative of the negative log-likelihood writes 



= E fe tr (-Y^Rfc^l^Y^+tr (Rfe 1 H T ) 

= ^ fe i T Rfc 1 (rtk - Rfc) Rfe x i 

l^N" 1 (R fc -R fc ) N" 1 ! 



(1 + Cn2) 

(l^N-1) 2 „ (l^N-Y,) 2 -!^ 1 ! 



=c y -fc tL _y 

^ k (I + Chi? ^ k 



2 



(1 + CnJ)- ^* (1 + Cn%) 

fc; z^ fe (C + nf) 2 

It follows that the likelihood is maximized for 

C = C(w) := y k W k (C,-N) [(n 2 l T N^Y fe ) 2 - 

with 



(C, N) := (C + hi) " 2 (C + ft?) - 2 ] _1 . (C2) 



As the optimal weights depend on C, this only defines implicitly the ML estimator. For some approximate spectrum 
C°, the proposed explicit NSE is given by C(wk) with = Wk(C° , N). 

Particular case of a single experiment 

In the particular case of a single experiment (E = 1) with heteroscedastic noise, following the model 

Yk = X k + n/eZ fc , 

the likelihood is maximized for 

C = CW := £ fe «;*((?, N) (y fe 2 - n\) (C3) 



18 



with Wk(C : N) defined by Eq. (C2), and again, assuming that is poorly sensitive to C, the NSE is C (wk(C )) for 
some approximate spectrum C . 
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